Abstract

Singular spectrum analysis (SSA) is considered for decomposition of time series into identifiable components.
The Basic SSA method is nonparametric and constructs an adaptive expansion based on singular
value decomposition. The investigated modification is able to take into consideration a structure given in
advance and therefore can be called semi-nonparametric. The approach, called SSA with projection, includes
preliminary projections of rows and columns of the series trajectory matrix onto given subspaces. One
application of SSA with projection is the extraction of polynomial trends, e.g., a linear trend. It is shown
that SSA with projection can extract polynomial trends much better than Basic SSA, especially for linear
trends. Numerical examples, including a comparison with the least-squares approach to polynomial regression,
are presented.


1 Introduction 

Singular spectrum analysis (SSA) can solve a wide range of problems in time series analysis, from the
decomposition of a series into interpretable components to forecasting, missing data imputation, parameter
estimation and many others. The key feature of SSA is that the basic
method is model-free, does not need a-priori information and therefore constructs an adaptive decomposition
of a time series into a sum of, e.g., a non-parametric trend, periodic components and noise; SSA has been
applied repeatedly to the problem of trend extraction. This can be considered a great
advantage of the SSA-family methods in comparison with parametric ones. However, sometimes there is
a-priori information about the considered time series. For example, the trend can be expected to be linear or
polynomial.

In [6, Section 1.7.1], SSA with single and double centering is developed to extract constant or linear trends
with better accuracy. We generalize this approach. Approaches that combine parametric
and nonparametric models are sometimes called semi-parametric if the parametric part of the model is of
interest and semi-nonparametric if both parts are important; examples of such
approaches can be found in statistical econometrics.

Let us explain the motivation for the suggested approach, which can be considered as a semi-nonparametric 
variation of singular spectrum analysis. 

In SSA, the separability theory is responsible for a proper decomposition and component extraction. The
separability of a series component means that the method is able to extract this component from the
observed series, which is a sum of many components. Basic SSA is able to approximately separate a trend (e.g.,
a linear trend) from oscillations. However, there is no series that can be exactly separated from a linear trend.
As a consequence, the separation accuracy is not high. It is shown in [6, Sections 1.7.1 and 6.3.2] that SSA
with double centering weakens the separability conditions and therefore improves the accuracy in conditions
of approximate separability. Thus, it is expected that, within the SSA-family methods, SSA with projection
can improve separability for components of a specific structure that agrees with the projection
subspaces.

In comparison with the linear regression technique (by which we will mean the least-squares approach to
the estimation of regression parameters), SSA with double centering differs in the statement of the problem.
Linear regression minimizes the prediction error, while SSA tries to separate the series components themselves
using their orthogonality. For example, for a series with common term $x_n = t_n + s_n$, where $t_n = an + b$ and
$s_n = A\sin(2\pi\omega n + \varphi)$, the least-squares approach generally cannot estimate the linear trend with no error,
while in the conditions of separability SSA with double centering is able to find the exact linear trend. For long
time series, the linear regression and SSA yield close estimates of the linear trend. Note that in the case of
approximate separability the trend found by SSA with double centering will only be close to a straight line,
while the linear regression always provides a linear function as a trend estimate. An analogous relation
between the parametric regression and SSA with projection is expected for the general case of polynomial
trends. In particular, we can suppose that for time series with seasonality the ‘SSA with projection’ method
will be able to extract linear and polynomial trends more accurately than the parametric regression approach.
It is important that the use of projection on a fixed basis does not contradict the non-parametric nature of SSA.
Moreover, if the basis is chosen incorrectly, the decomposition will not be optimal and the trend estimate will
be less accurate; however, the estimate will not have a considerable bias, since it can be supplemented by
components of the adaptive part of the whole decomposition. This is not the case for the parametric approach.

The Basic SSA method consists of trajectory matrix construction from the original time series, its
decomposition into a sum of rank-one matrices by the SVD, their grouping, and the return of each group to a time series
to obtain a decomposition of the original time series into a sum of identifiable components. The grouping of
the SVD components can be considered as a projection of the trajectory matrix columns onto a subspace that is adaptively
constructed based on the distinguishing features of the SVD. SSA with projection starts with
projections of the trajectory matrix columns and rows onto subspaces chosen in advance and then decomposes the
residual in the same way as in Basic SSA. In particular, SSA with double centering uses the projections onto the
subspaces spanned by the vectors with all elements equal to 1. A natural application of SSA with projection, which
is mostly considered in this paper, is the extraction of polynomial trends; however, the suggested method
can be applied to a wider range of problems, e.g., to use information about a supporting series.


The structure of the paper is as follows. We start with a short description of the algorithm of Basic SSA
and the standard separability notion (Section 2). Section 3 is devoted to generalizing the centering used in SSA and
contains the underlying theory, including the proof of the algorithm and the separability conditions. Section 4
demonstrates examples of the algorithm application for trend extraction. The real-life examples are studied
in Sections 4.1 and 4.2 to show the relation between Basic SSA, SSA with projection and the linear regression
(least-squares) approach. Numerical comparison is performed in Section 4.3. The paper is summarized and
conclusions are drawn in the final section.


2 Necessary information 

2.1 Algorithm of Basic SSA 

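Consider a real-valued time series $\mathsf{X} = \mathsf{X}_N = (x_1,\ldots,x_N)$ of length $N$. Let $L$ ($1 < L < N$) be some integer called
window length and $K = N - L + 1$.

For convenience, denote by $\mathcal{M}_{L,K}$ the space of matrices of size $L \times K$ and by $\mathcal{M}^{(H)}_{L,K}$ the space of Hankel matrices
of size $L \times K$. Consider the lagged vectors $X_i = (x_i,\ldots,x_{i+L-1})^{\mathrm{T}}$, $i = 1,\ldots,K$, and the trajectory matrix
$\mathbf{X} = [X_1 : \ldots : X_K] \in \mathcal{M}^{(H)}_{L,K}$ of the series $\mathsf{X}_N$.

Define the one-to-one embedding operator $\mathcal{T}: \mathbb{R}^N \to \mathcal{M}^{(H)}_{L,K}$ as $\mathcal{T}(\mathsf{X}_N) = \mathbf{X}$. Also introduce the projector
$\Pi^{(H)}$ (in Frobenius norm) of $\mathcal{M}_{L,K}$ onto $\mathcal{M}^{(H)}_{L,K}$. The projection is performed by replacing the entries on the auxiliary diagonals
$i + j = \mathrm{const}$ with their averages along the diagonal.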

The Basic SSA algorithm consists of four steps.

1st step: Embedding. Let $L$ be chosen. At this step the $L$-trajectory matrix is composed: $\mathbf{X} = \mathcal{T}(\mathsf{X}_N)$.

2nd step: Singular Value Decomposition (SVD). The SVD of the trajectory matrix is constructed:

$$\mathbf{X} = \sum_{i=1}^{d} \sqrt{\lambda_i}\, U_i V_i^{\mathrm{T}} = \mathbf{X}_1 + \ldots + \mathbf{X}_d, \qquad (1)$$

where $\sqrt{\lambda_i}$ are singular values, $U_i$ and $V_i$ are left and right singular vectors of $\mathbf{X}$, $\lambda_1 \ge \ldots \ge \lambda_d > 0$, $d = \operatorname{rank}(\mathbf{X})$.

The triple $(\sqrt{\lambda_i}, U_i, V_i)$ is called the $i$th eigentriple (abbreviated as ET).

3rd step: Eigentriple grouping. The grouping procedure partitions the set of indices $\{1,\ldots,d\}$ into $m$
disjoint subsets $I_1,\ldots,I_m$.

Define $\mathbf{X}_I = \sum_{i \in I} \mathbf{X}_i$. The expansion (1) leads to the decomposition

$$\mathbf{X} = \mathbf{X}_{I_1} + \ldots + \mathbf{X}_{I_m}. \qquad (2)$$

If $m = d$ and $I_j = \{j\}$, $j = 1,\ldots,d$, then the corresponding grouping is called elementary.

4th step: Diagonal averaging. Obtain the series by diagonal averaging of the matrix components of (2):
$\widetilde{\mathsf{X}}_N^{(k)} = \mathcal{T}^{-1}\Pi^{(H)}\mathbf{X}_{I_k}$.

Thus, the algorithm results in the constructed decomposition of the observed time series

$$\mathsf{X}_N = \sum_{k=1}^{m} \widetilde{\mathsf{X}}_N^{(k)}. \qquad (3)$$

A typical example of (3) is the decomposition into a sum of a trend, oscillations and noise.

Remark 1. Columns of a grouped matrix $\mathbf{X}_I$ are the projections of the columns of the trajectory matrix $\mathbf{X}$ onto
$\operatorname{span}(U_i, i \in I)$. Rows of $\mathbf{X}_I$ are the projections of the rows of $\mathbf{X}$ onto $\operatorname{span}(V_i, i \in I)$.
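The four steps above can be sketched in a few lines of base R. This is a minimal illustration, not the Rssa implementation: the function name `basic_ssa`, the helper `hankelize`, the test series and the grouping indices are all assumptions made for the example.

```r
# Minimal Basic SSA sketch in base R (illustrative names, not from Rssa).
basic_ssa <- function(x, L, groups) {
  N <- length(x)
  K <- N - L + 1
  # Step 1: embedding into the L x K trajectory (Hankel) matrix
  X <- outer(seq_len(L), seq_len(K), function(i, j) x[i + j - 1])
  # Step 2: SVD, X = sum_i d[i] * u[, i] v[, i]^T
  s <- svd(X)
  # Diagonal averaging: mean over the anti-diagonals i + j = const
  hankelize <- function(M) as.numeric(tapply(M, row(M) + col(M) - 1, mean))
  # Steps 3-4: sum the elementary matrices of each group, then return to a series
  lapply(groups, function(I) {
    XI <- s$u[, I, drop = FALSE] %*% diag(s$d[I], nrow = length(I)) %*%
      t(s$v[, I, drop = FALSE])
    hankelize(XI)
  })
}

# Hypothetical test series: linear trend plus a sinusoid (rank 2 + 2 = 4)
n <- 1:60
x <- 0.1 * n + sin(2 * pi * n / 12)
parts <- basic_ssa(x, L = 24, groups = list(a = 1:2, b = 3:4, rest = 5:24))
recon <- Reduce(`+`, parts)   # the grouped series sum back to x
```

Because the groups partition all eigentriples, the reconstructed components add up to the original series. Which eigentriples correspond to the trend is not decided by this sketch; that identification is the grouping problem discussed in the text.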

2.2 Separability by Basic SSA 

To understand how SSA works, the notion of separability is very important. Separability of two time series
$\mathsf{X}_N^{(1)}$ and $\mathsf{X}_N^{(2)}$ signifies the possibility of extracting $\mathsf{X}_N^{(1)}$ from the observed sum $\mathsf{X}_N = \mathsf{X}_N^{(1)} + \mathsf{X}_N^{(2)}$. This means
that there exists a grouping at the Grouping step such that $\widetilde{\mathsf{X}}_N^{(m)} = \mathsf{X}_N^{(m)}$.

By properties of the SVD, separability amounts to the orthogonality of the column and row spaces
of the trajectory matrices of the series $\mathsf{X}_N^{(1)}$ and $\mathsf{X}_N^{(2)}$. In the case of approximate (asymptotic) separability,
$\widetilde{\mathsf{X}}_N^{(k)} \approx \mathsf{X}_N^{(k)}$, we obtain the condition of approximate (asymptotic) orthogonality.

For sufficiently long time series, SSA can approximately separate, for example, a signal and noise, sine
waves with different frequencies, a trend and a seasonality [6, 7].

The introduced separability, which is called weak separability, means that at the SVD step there exists
such an SVD that allows the proper grouping. Strong separability means that each SVD allows
the proper grouping. Several nonparametric modifications of SSA for improvement of the weak and strong
separability have been considered in the literature. In this paper we will improve the separability by a semi-nonparametric
variation.

2.3 Series of finite rank and series governed by linear recurrence relations 

Let us describe the class of series of finite rank, which is natural for SSA. In particular, only such time series
can be exactly separated by Basic SSA.

Define the $L$-rank of a series as the rank of its $L$-trajectory matrix. Series with rank-deficient trajectory
matrices are of special interest. A time series is called a time series of finite rank $r$ if its $L$-trajectory matrix has
rank $r$ for any $L \ge r$ (it is convenient to assume that $L \le K$). We will call the column and row spaces of the
trajectory matrices the column and row spaces of the series, respectively.
Under some unrestrictive conditions [6, Section 5.2], a series of finite rank $r$ is governed by a linear
recurrence relation (LRR) of order $r$, that is,

$$s_{i+r} = \sum_{k=1}^{r} a_k s_{i+r-k}, \qquad 1 \le i \le N - r. \qquad (4)$$

The LRR (4) is called minimal and $r$ is called the dimension of the series. Let us describe how we can
restore the form of the time series by means of the minimal LRR.

Definition 1. The polynomial $P_r(\mu) = \mu^r - \sum_{k=1}^{r} a_k \mu^{r-k}$ is called the characteristic polynomial of the LRR (4).

Let the time series $\mathsf{S}_\infty = (s_1,\ldots,s_n,\ldots)$ satisfy the LRR (4) with $a_r \ne 0$ and $i \ge 1$. Consider the characteristic
polynomial of the LRR (4) and denote its different (complex) roots by $\mu_1,\ldots,\mu_p$, where $p \le r$. All these roots
are non-zero as $a_r \ne 0$. Let the multiplicity of the root $\mu_m$ be $k_m$, where $1 \le m \le p$ and $k_1 + \ldots + k_p = r$. We
will call $\mu_j$ the characteristic roots of the series governed by an LRR.

It is well known that the time series $\mathsf{S}_\infty = (s_1,\ldots,s_n,\ldots)$ satisfies the LRR (4) for all $i \ge 1$ if and only if

$$s_n = \sum_{m=1}^{p} \Bigl( \sum_{j=0}^{k_m - 1} c_{mj}\, n^j \Bigr) \mu_m^n, \qquad (5)$$

where the coefficients $c_{mj}$ are determined by the first $r$ series terms. For real-valued time series, (5) implies that
the class of time series governed by LRRs consists of sums of products of polynomials, exponentials and
sinusoids.

The rank of the series is equal to the number of non-zero terms in (5). For example, an exponentially-modulated
sinusoid $s_n = A e^{\alpha n}\sin(2\pi\omega n + \varphi)$ is constructed from two conjugate complex roots $\mu_{1,2} = e^{\alpha \pm \mathrm{i} 2\pi\omega}$
if its frequency $\omega \in (0, 0.5)$. Therefore, the rank of this exponentially-modulated sinusoid is equal to 2. The
rank of an exponential is equal to 1, the rank of a linear function corresponding to the root 1 of multiplicity 2
equals 2, and so on.

Also, the representation (5) helps to easily construct the bases of trajectory spaces of complex time series
governed by LRRs: they are constructed from the linearly independent vectors $(0^j \mu_m^0, 1^j \mu_m^1, \ldots, (L-1)^j \mu_m^{L-1})^{\mathrm{T}}$.
For linear series, the basis consists of $(1,\ldots,1)^{\mathrm{T}}$ and $(0,1,2,\ldots,L-1)^{\mathrm{T}}$.
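As a worked instance of (4) and (5), consider the linear series $s_n = an + b$ with $a \ne 0$. The computation below is a standard check added for illustration; it is not taken from the paper.

$$s_{i+2} = 2 s_{i+1} - s_i, \qquad \text{since } 2\bigl(a(i+1) + b\bigr) - (ai + b) = a(i+2) + b,$$

$$P_2(\mu) = \mu^2 - 2\mu + 1 = (\mu - 1)^2 .$$

Thus $r = 2$, $a_1 = 2$, $a_2 = -1$, and there is a single characteristic root $\mu_1 = 1$ of multiplicity $k_1 = 2$. The representation (5) reduces to $s_n = (c_{10} + c_{11} n)\cdot 1^n$ with $c_{10} = b$ and $c_{11} = a$, which has two non-zero terms, in agreement with the rank 2 stated above.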

3 SSA with projection 

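Let us consider a time series $\mathsf{X}$ of length $N$, a window length $L$, $K = N - L + 1$, and the trajectory matrix $\mathbf{X}$ of the
series $\mathsf{X}$.

A general form of the considered modification can be expressed as

• Calculation of a special matrix $\mathbf{C} = \mathbf{C}_{\mathbf{X}}$ based on a-priori information.

• Computation of $\mathbf{X}' = \mathbf{X} - \mathbf{C}$.

• Construction of the SVD: $\mathbf{X}' = \sum_{i=1}^{d'} \sqrt{\lambda'_i}\, U'_i (V'_i)^{\mathrm{T}}$.

Thus, we have the decomposition $\mathbf{X} = \mathbf{C} + \sum_{i=1}^{d'} \sqrt{\lambda'_i}\, U'_i (V'_i)^{\mathrm{T}}$.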

Centering, which is a particular case of the general scheme, is considered in the following forms 

1. Single row centering when C corresponds to averaging by rows, that is, each element of a row of C 
consists of the average of the corresponding row of the trajectory matrix. 

2. Single column centering when C corresponds to averaging by columns. 

3. Double centering when C corresponds to averaging by both rows and columns. 
Single centering can be considered as a projection of rows or columns of $\mathbf{X}$ onto $\operatorname{span}(E_M)$, where $E_M =
(1,\ldots,1)^{\mathrm{T}} \in \mathbb{R}^M$ and $M$ is equal to $L$ or $K$. Therefore, centering in SSA can be considered as a preliminary
projection of the trajectory matrix onto a given subspace; the residual matrix $\mathbf{X}'$ will be subsequently expanded
by the SVD or any other decomposition.

Let us generalize this approach to projections onto arbitrary spaces. Denote a basis of the column projection
space by $\{P_i, i = 1,\ldots,p\}$ and/or a basis of the row projection space by $\{Q_i, i = 1,\ldots,q\}$. Let
$\Pi_{\mathrm{col}}: \mathbb{R}^L \to \operatorname{span}(P_i, i = 1,\ldots,p)$
and $\Pi_{\mathrm{row}}: \mathbb{R}^K \to \operatorname{span}(Q_i, i = 1,\ldots,q)$ be orthogonal projectors. For any $\mathbf{Y} \in \mathcal{M}_{L,t}$, denote by $\Pi_{\mathrm{col}}(\mathbf{Y})$ the
matrix consisting of the columns that result from projections of the columns of $\mathbf{Y}$, while for any $\mathbf{Y} \in \mathcal{M}_{t,K}$
denote by $\Pi_{\mathrm{row}}(\mathbf{Y})$ the matrix consisting of the rows that result from projections of the rows of $\mathbf{Y}$.

In SSA with projection, the scheme of SSA with centering is extended to arbitrary projections, that is,
$\mathbf{C} = \Pi_{\mathrm{col}}(\mathbf{X})$ for column projection, $\mathbf{C} = \Pi_{\mathrm{row}}(\mathbf{X})$ for row projection and $\mathbf{C} = \Pi_{\mathrm{both}}(\mathbf{X})$ for double projection,
where $\Pi_{\mathrm{both}}(\mathbf{X}) = \Pi_{\mathrm{row}}(\mathbf{X}) + \Pi_{\mathrm{col}}(\mathbf{X} - \Pi_{\mathrm{row}}(\mathbf{X}))$. If either the column or row basis is absent (that is, the
corresponding projection should not be performed), then we formally set the corresponding projector to be the
zero operator, so that $\mathbf{C} = \Pi_{\mathrm{both}}(\mathbf{X})$ in any mode.
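For double centering, the matrix $\mathbf{C}$ can be written out entrywise. The following short derivation is an illustration added here; the bar notation for means is introduced only for this example. With both projections taken onto the constant vectors $E_K$ and $E_L$, and with $x_{ij}$ the entries of $\mathbf{X}$,

$$\bigl(\Pi_{\mathrm{row}}(\mathbf{X})\bigr)_{ij} = \bar{x}_{i\cdot} = \frac{1}{K}\sum_{j'=1}^{K} x_{ij'}, \qquad
\bigl(\Pi_{\mathrm{col}}(\mathbf{X} - \Pi_{\mathrm{row}}(\mathbf{X}))\bigr)_{ij} = \bar{x}_{\cdot j} - \bar{x}_{\cdot\cdot},$$

$$c_{ij} = \bar{x}_{i\cdot} + \bar{x}_{\cdot j} - \bar{x}_{\cdot\cdot},$$

where $\bar{x}_{\cdot j}$ is the mean of the $j$th column and $\bar{x}_{\cdot\cdot}$ is the mean of all entries. In particular, any matrix with entries of the form $f(i) + g(j)$ is reproduced exactly, which is the case for the trajectory matrix of a linear series, whose entries are $a(i + j - 1) + b$.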

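Note that the method of SSA with projection differs from Basic SSA only in the Decomposition step:

$$\mathbf{X} = \mathbf{C} + \sum_{i=1}^{d'} \sqrt{\lambda'_i}\, U'_i (V'_i)^{\mathrm{T}}, \qquad (6)$$

where $\sum_{i=1}^{d'} \sqrt{\lambda'_i}\, U'_i (V'_i)^{\mathrm{T}}$ is the SVD of $\mathbf{X} - \mathbf{C}$. Let us show that $\mathbf{C}$ can be represented as a sum of elementary
matrices and therefore the Reconstruction steps can be performed in the same way as in Basic SSA.

Without loss of generality we assume that $\{P_i, i = 1,\ldots,p\}$ and $\{Q_i, i = 1,\ldots,q\}$ are orthonormal systems
(otherwise, we can perform orthonormalization). Denote $\mathbf{P} = [P_1 : \ldots : P_p]$, $\mathbf{Q} = [Q_1 : \ldots : Q_q]$. Then
$\Pi_{\mathrm{col}}(\mathbf{Y}) = \mathbf{P}\mathbf{P}^{\mathrm{T}}\mathbf{Y} = \sum_{i=1}^{p} P_i (\mathbf{Y}^{\mathrm{T}} P_i)^{\mathrm{T}}$ and $\Pi_{\mathrm{row}}(\mathbf{Y}) = \mathbf{Y}\mathbf{Q}\mathbf{Q}^{\mathrm{T}} = \sum_{i=1}^{q} (\mathbf{Y} Q_i) Q_i^{\mathrm{T}}$. Since $\mathbf{C} = \Pi_{\mathrm{both}}(\mathbf{X})$ can be expressed as
a sequential application of the projection operators $\Pi_{\mathrm{row}}$ and $\Pi_{\mathrm{col}}$, (6) is a decomposition of $\mathbf{X}$ into elementary
matrix components that is unambiguously defined. For double projection, this representation depends on the order of
projections; we will apply the row projector first.

Thus, the matrix $\mathbf{C}$ can be considered as a sum of $p + q$ elementary matrices of the forms $\sigma_i^{(c)} P_i \widetilde{Q}_i^{\mathrm{T}}$,
$i = 1,\ldots,p$, and $\sigma_i^{(r)} \widetilde{P}_i Q_i^{\mathrm{T}}$, $i = 1,\ldots,q$. The triples $(\sigma_i^{(c)}, P_i, \widetilde{Q}_i)$ and $(\sigma_i^{(r)}, \widetilde{P}_i, Q_i)$ have the same meaning as
eigentriples.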

The Reconstruction stage is exactly the same as in the Basic SSA method. Note that it makes little sense to 
include the eigentriples produced by projections to different groups, since the projections are performed on the 
subspaces as a whole. 

3.1 Appropriate class of time series 

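For SSA with projection, a known series component with a trajectory matrix $\mathbf{Y}$ should be in agreement with
the projection, so that $\Pi_{\mathrm{col}}(\mathbf{Y}) = \mathbf{Y}$ for column projection, $\Pi_{\mathrm{row}}(\mathbf{Y}) = \mathbf{Y}$ for row projection and $\Pi_{\mathrm{both}}(\mathbf{Y}) = \mathbf{Y}$ for
double projection.

Clearly, for column and row projections, this is true if the corresponding projection is performed onto the
column or row trajectory space of the known series component. For example, the trajectory space of an
exponential component $s_n = \mu^n$ is spanned by $(1, \mu, \ldots, \mu^{L-1})^{\mathrm{T}}$, while the trajectory space of a linear function $s_n = an + b$
is spanned by $(1,1,\ldots,1)^{\mathrm{T}}$ and $(1,2,\ldots,L)^{\mathrm{T}}$ for any $b$ and non-zero $a$.

Let us derive a condition sufficient for $\Pi_{\mathrm{both}}(\mathbf{X}) = \mathbf{X}$ to hold for the general case of the double projection.

Lemma 1. Let $\Pi_{\mathrm{row}}(\mathbf{Q}^{\mathrm{T}}) = \mathbf{Q}^{\mathrm{T}}$ and $\Pi_{\mathrm{col}}(\mathbf{P}) = \mathbf{P}$ for $\mathbf{P} \in \mathcal{M}_{L,p}$ and $\mathbf{Q} \in \mathcal{M}_{K,q}$. Then $\Pi_{\mathrm{both}}(\mathbf{X}) = \mathbf{X}$ for

$$\mathbf{X} = \mathbf{P}\widetilde{\mathbf{Q}}^{\mathrm{T}} + \widetilde{\mathbf{P}}\mathbf{Q}^{\mathrm{T}}, \qquad (7)$$

where $\widetilde{\mathbf{P}} \in \mathcal{M}_{L,q}$ and $\widetilde{\mathbf{Q}} \in \mathcal{M}_{K,p}$.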
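Proof. By the assumption, $\Pi_{\mathrm{row}}(\mathbf{A}\mathbf{Q}^{\mathrm{T}}) = \mathbf{A}\mathbf{Q}^{\mathrm{T}}$ for any $t$ and matrix $\mathbf{A} \in \mathcal{M}_{t,q}$, while $\Pi_{\mathrm{col}}(\mathbf{P}\mathbf{B}^{\mathrm{T}}) = \mathbf{P}\mathbf{B}^{\mathrm{T}}$ for
any matrix $\mathbf{B} \in \mathcal{M}_{t,p}$. Therefore,

$$\Pi_{\mathrm{both}}\mathbf{X} = \widetilde{\mathbf{P}}\mathbf{Q}^{\mathrm{T}} + \Pi_{\mathrm{row}}(\mathbf{P}\widetilde{\mathbf{Q}}^{\mathrm{T}}) + \mathbf{P}\widetilde{\mathbf{Q}}^{\mathrm{T}} + \Pi_{\mathrm{col}}(\widetilde{\mathbf{P}}\mathbf{Q}^{\mathrm{T}})
- \Pi_{\mathrm{col}}\bigl(\Pi_{\mathrm{row}}(\mathbf{P}\widetilde{\mathbf{Q}}^{\mathrm{T}} + \widetilde{\mathbf{P}}\mathbf{Q}^{\mathrm{T}})\bigr) = \mathbf{X},$$

since $\Pi_{\mathrm{col}} \circ \Pi_{\mathrm{row}} = \Pi_{\mathrm{row}} \circ \Pi_{\mathrm{col}}$. □

It is easy to check that the trajectory matrix of a linear series satisfies the conditions of Lemma 1 for the case
of double centering. However, for a general case the approach based on characteristic roots is more convenient.
We start with a technical lemma.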

Lemma 2. For any polynomial $P_d$ of order $d$ and for any $m$ and $l$ such that $m + l = d - 1$ the following expansion
can be constructed:

$$P_d(i + j) = P_{m,d}(i, j) + P_{d,l}(i, j),$$

where $P_{u,v}(i, j)$ denotes a polynomial of $i$ and $j$ of order $(u, v)$.

Proof. This lemma is proved by an appropriate grouping of the monomials $c_{p,q}\, i^p j^q$, $p + q \le d$, of $P_d(i + j)$. □

Recall that a series governed by an LRR, whose characteristic polynomial has the given set of roots called
characteristic roots, is of the form (5).

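Theorem 1. Let a series $\mathsf{Y}^{(m)}$ ($m = 1, 2$) be governed by an LRR of order $r_m$, and let $\mathbf{Y}^{(m)}$ be its trajectory matrix. Let
$\{\mu_j, j = 1,\ldots,s\}$ be the set containing the characteristic roots of both series. Assume that $\mathsf{Y}^{(m)}$ has roots $\mu_j$,
$j = 1,\ldots,s$, with multiplicities $d_j^{(m)} \ge 0$, $\sum_{j=1}^{s} d_j^{(m)} = r_m$. Let $\Pi_{\mathrm{col}}$ be the projector onto the column space of
$\mathbf{Y}^{(1)}$, $\Pi_{\mathrm{row}}$ be the projector onto the row space of $\mathbf{Y}^{(2)}$, and $\Pi_{\mathrm{both}} = \Pi_{\mathrm{col}} + \Pi_{\mathrm{row}} - \Pi_{\mathrm{col}} \circ \Pi_{\mathrm{row}}$. Then $\Pi_{\mathrm{both}}(\mathbf{X}) = \mathbf{X}$
if and only if the set of characteristic roots of the series $\mathsf{X}$ consists of the roots $\mu_j$, $j = 1,\ldots,s$, of multiplicities
$d_j \le d_j^{(1)} + d_j^{(2)}$.

Proof. Due to the linearity of projectors and the linear dependence of $\Pi_{\mathrm{both}}$ on $\Pi_{\mathrm{row}}$ and $\Pi_{\mathrm{col}}$, it is sufficient to
prove the theorem for the case of one root $\mu$. Let $\mathsf{Y}^{(1)}$ have the characteristic root $\mu$ of multiplicity $p$ and $\mathsf{Y}^{(2)}$ have
the characteristic root $\mu$ of multiplicity $q$.

Thus, we should prove that $\Pi_{\mathrm{both}}(\mathbf{X}) = \mathbf{X}$ if and only if the series $\mathsf{X}$ has the form $x_k = P_t(k)\mu^k$, where
$t \le p + q - 1$. It is sufficient to take $t = p + q - 1$.

By Lemma 2,

$$P_{p+q-1}(i + j)\,\mu^{i+j} = P_{p-1,\,p+q-1}(i, j)\,\mu^i \mu^j + P_{p+q-1,\,q-1}(i, j)\,\mu^i \mu^j .$$

This means that (7) holds for $\mathbf{Q} \in \mathcal{M}_{K,q}$ and $\mathbf{P} \in \mathcal{M}_{L,p}$ such that the column space of $\mathbf{Q}$ coincides with the row space of $\mathbf{Y}^{(2)}$ and
the column space of $\mathbf{P}$ coincides with the column space of $\mathbf{Y}^{(1)}$.

Since the dimension of the space of trajectory matrices that are kept by the projector $\Pi_{\mathrm{both}}$ is equal to
$r = r_1 + r_2$, we have found all such matrices. This completes the proof. □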

Corollary 1. Let $\mathsf{Y}$ be a series of dimension $r$, $\mathbf{Y}$ be its trajectory matrix, $\Pi_{\mathrm{row}}$ be the projection onto its row
trajectory space, and $\Pi_{\mathrm{col}}$ be the projection onto its column trajectory space. Consider the series $\mathsf{X}$ with $x_n = (an +
b)y_n$. Then $\Pi_{\mathrm{both}}(\mathbf{X}) = \mathbf{X}$, where $\Pi_{\mathrm{both}} = \Pi_{\mathrm{row}} + \Pi_{\mathrm{col}} - \Pi_{\mathrm{row}} \circ \Pi_{\mathrm{col}}$.

Remark 2. Note that multiplication of a series by $an + b$ means that the multiplicities of its characteristic roots
increase by 1.

Corollary 2. Let $\Pi_{\mathrm{row}}$ be the projection onto the row trajectory space of the polynomial of order $m$ and $\Pi_{\mathrm{col}}$ be the
projection onto the column trajectory space of the polynomial of order $k$. Then for the polynomial $\mathsf{X} = P_{m+k+1}$ of
order $m + k + 1$ we have $\Pi_{\mathrm{both}}(\mathbf{X}) = \mathbf{X}$.
3.2 Separability


We expect that if a time series component is governed by a minimal LRR and this LRR is known, then the
series component can be separated by a suitable version of SSA with projection better than it can be done by
Basic SSA.

Using the notion of separability, we can formulate this improvement as follows. Let $\mathsf{X} = \mathsf{X}^{(1)} + \mathsf{X}^{(2)}$. We
will say that a time series component $\mathsf{X}^{(1)}$ is separated by SSA with projection if its trajectory matrix $\mathbf{X}^{(1)}$ equals $\mathbf{C}$, where $\mathbf{C}$ is as in
(6).

Let $\mathsf{X}^{(1)}$ be a series of finite rank, $\mathsf{X} = \mathsf{X}^{(1)} + \mathsf{X}^{(2)}$. Similar to [6], where conditions for separability by SSA
with centering are considered, the following conditions of separability can be obtained.

1. Basic SSA:

$\mathsf{X}^{(1)}$ and $\mathsf{X}^{(2)}$ are separable if (if and only if, by definition) their row and column spaces are orthogonal.

2. SSA with row projection onto the row space of $\mathsf{X}^{(1)}$:

$\mathsf{X}^{(1)}$ and $\mathsf{X}^{(2)}$ are separable if their row spaces are orthogonal.

3. SSA with column projection onto the column space of $\mathsf{X}^{(1)}$:

$\mathsf{X}^{(1)}$ and $\mathsf{X}^{(2)}$ are separable if their column spaces are orthogonal.

4. SSA with double projection onto the row and column spaces of $\mathsf{Y}$, where $\mathsf{X}^{(1)}$ and $\mathsf{Y}$ are such that $x_n^{(1)} =
(an + b)y_n$, $a \ne 0$:

$\mathsf{X}^{(1)}$ and $\mathsf{X}^{(2)}$ are separable if the row and column spaces of $\mathsf{Y}$ and $\mathsf{X}^{(2)}$ are orthogonal.

Note that the separability by SSA with projection is always strong, since projections on linear spaces are 
uniquely defined. 

For the approximate separability, where $\mathbf{X}^{(1)} \approx \mathbf{C}$, the approximate orthogonality is necessary. Also, the
asymptotic separability can be considered by analogy with the conventional separability for Basic SSA and
SSA with centering.

Recall that the usual double centering in SSA corresponds to a constant series $\mathsf{Y}$ and therefore to a linear
series $\mathsf{X}^{(1)}$. Orthogonality to a constant series is a much weaker condition than that to a linear series (moreover,
the condition of orthogonality to a linear series can never be exactly satisfied). In particular, any sinusoid with
frequency $\omega$ is asymptotically separable from the linear trend and the exact separability by SSA with projection
takes place if $L\omega$ and $K\omega$ are integers, that is, if $L$ and $K$ are divisible by the period of the sinusoid. Therefore,
for extraction of linear trends, the double centering is recommended.
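The exact-separability condition for a sinusoid can be checked directly. The derivation below is an illustration added here and uses only the standard formula for a sum of sines in arithmetic progression. For $s_n = A\sin(2\pi\omega n + \varphi)$ with $\omega \in (0, 0.5)$, the sum over the $i$th row of its trajectory matrix is

$$\sum_{j=1}^{K} A\sin\bigl(2\pi\omega(i + j - 1) + \varphi\bigr)
= A\,\frac{\sin(\pi\omega K)}{\sin(\pi\omega)}\,\sin\bigl(2\pi\omega i + \pi\omega(K - 1) + \varphi\bigr),$$

which vanishes for every $i$ when $K\omega$ is an integer, because then $\sin(\pi\omega K) = 0$ while $\sin(\pi\omega) \ne 0$. The same argument with $L$ in place of $K$ shows that all column sums vanish when $L\omega$ is an integer. Hence both centering projections annihilate the trajectory matrix of the sinusoid, while the trajectory matrix of a linear series, with entries $a(i + j - 1) + b$, is reproduced by double centering. The matrix $\mathbf{C}$ then coincides with the trajectory matrix of the linear trend.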

In the case of a polynomial trend of degree larger than 1, the conditions of exact separability cannot be 
satisfied at all, even for SSA with double projection. However, we still can expect that in the case of polynomial 
trends, SSA with double projection also will work better than SSA with only row or column projections and 
also better than Basic SSA. 

3.3 Algorithm 

Let us summarize the steps of SSA with projection in the form of algorithms, splitting the whole algorithm into 
decomposition and reconstruction. 

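Algorithm 3.1: SSA with projection: decomposition
Input: The time series $\mathsf{X}$ of length $N$, the window length $L$, an orthonormal basis of the column projection
space $\{P_i, i = 1,\ldots,p\}$ and an orthonormal basis of the row projection space $\{Q_i, i = 1,\ldots,q\}$. Either $p$ or
$q$ can be zero.

Output: Decomposition of the trajectory matrix into elementary matrices $\mathbf{X} = \mathbf{X}_1 + \ldots + \mathbf{X}_d$, where $\mathbf{X}_i =
\sqrt{\sigma_i}\, U_i V_i^{\mathrm{T}}$ are rank-one matrices.

1: Construct the trajectory matrix $\mathbf{X} = \mathcal{T}_{\mathrm{SSA}}(\mathsf{X})$.

2: Subtract the row projection: $\mathbf{X}' = \mathbf{X} - \mathbf{C}$, where

$$\mathbf{C} = \Pi_{\mathrm{row}}(\mathbf{X}) = \sum_{i=1}^{q} \sigma_i^{(r)} \widetilde{P}_i Q_i^{\mathrm{T}}, \qquad
\sigma_i^{(r)} = \|\mathbf{X} Q_i\|, \quad \widetilde{P}_i = \mathbf{X} Q_i / \sigma_i^{(r)} .$$

3: Subtract the column projection: $\mathbf{X}'' = \mathbf{X}' - \mathbf{C}'$, where

$$\mathbf{C}' = \Pi_{\mathrm{col}}(\mathbf{X}') = \sum_{i=1}^{p} \sigma_i^{(c)} P_i \widetilde{Q}_i^{\mathrm{T}}, \qquad
\sigma_i^{(c)} = \|(\mathbf{X}')^{\mathrm{T}} P_i\|, \quad \widetilde{Q}_i = (\mathbf{X}')^{\mathrm{T}} P_i / \sigma_i^{(c)} .$$

4: Construct a decomposition $\mathbf{X}'' = \sum_{i=1}^{d''} \mathbf{X}''_i$, where $\mathbf{X}''_i = \sqrt{\lambda''_i}\, U''_i (V''_i)^{\mathrm{T}}$; it can be performed by the
Decomposition step of Basic SSA.

5: As a result, $\mathbf{X} = \sum_{i=1}^{d} \mathbf{X}_i$, where $d = p + q + d''$, $\mathbf{X}_i = \sigma_i^{(r)} \widetilde{P}_i Q_i^{\mathrm{T}}$ for $i = 1,\ldots,q$, $\mathbf{X}_{i+q} = \sigma_i^{(c)} P_i \widetilde{Q}_i^{\mathrm{T}}$ for
$i = 1,\ldots,p$, and $\mathbf{X}_{i+p+q} = \sqrt{\lambda''_i}\, U''_i (V''_i)^{\mathrm{T}}$ for $i = 1,\ldots,d''$.

Similar to Basic SSA, SSA with projection provides a decomposition into matrices that are orthogonal in the Frobenius inner product;
therefore, the contributions of $\mathbf{X}_i$ are given by $\|\mathbf{X}_i\|_F^2 / \|\mathbf{X}\|_F^2$. However, the obtained decomposition into a sum of
rank-one matrices can be non-minimal (their number is larger than the rank of $\mathbf{X}$) if at least one basis vector
used for the projections does not belong to the column (row) trajectory space.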
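The decomposition algorithm can be sketched in base R as follows. This is an illustrative sketch under stated assumptions, not the Rssa implementation: the function name `proj_ssa_decompose`, its arguments, the tolerance used to drop numerically zero singular values and the test series are all hypothetical. The bases `P` and `Q` are assumed to have orthonormal columns.

```r
# Sketch of Algorithm 3.1 in base R; all names here are hypothetical.
# P (L x p) and Q (K x q) must have orthonormal columns; NULL skips a projection.
proj_ssa_decompose <- function(x, L, P = NULL, Q = NULL) {
  K <- length(x) - L + 1
  X <- outer(seq_len(L), seq_len(K), function(i, j) x[i + j - 1])  # step 1
  comps <- list()
  R <- X
  # Step 2: row projection, rank-one terms (X Q_i) Q_i^T
  for (i in seq_len(if (is.null(Q)) 0 else ncol(Q))) {
    Ci <- outer(as.numeric(X %*% Q[, i]), Q[, i])
    comps[[length(comps) + 1]] <- Ci
    R <- R - Ci
  }
  X1 <- R  # X' = X - Pi_row(X)
  # Step 3: column projection of X', rank-one terms P_i (X'^T P_i)^T
  for (i in seq_len(if (is.null(P)) 0 else ncol(P))) {
    Ci <- outer(P[, i], as.numeric(crossprod(X1, P[, i])))
    comps[[length(comps) + 1]] <- Ci
    R <- R - Ci
  }
  # Step 4: SVD of the residual X''; numerically zero terms are dropped
  s <- svd(R)
  for (i in which(s$d > 1e-10 * max(s$d[1], 1))) {
    comps[[length(comps) + 1]] <- s$d[i] * outer(s$u[, i], s$v[, i])
  }
  list(X = X, components = comps)
}

# Double centering, i.e. ProjSSA(1,1): constant unit vectors as both bases
n <- 1:47
x <- 0.2 * n + 1 + sin(2 * pi * n / 12) + 0.3 * cos(2 * pi * n / 5)
L <- 24
K <- length(x) - L + 1
dec <- proj_ssa_decompose(x, L,
                          P = matrix(1 / sqrt(L), L, 1),
                          Q = matrix(1 / sqrt(K), K, 1))
total <- Reduce(`+`, dec$components)                       # step 5: sums to X
shares <- sapply(dec$components, function(M) sum(M^2)) / sum(dec$X^2)
```

The vector `shares` holds the contributions $\|\mathbf{X}_i\|_F^2 / \|\mathbf{X}\|_F^2$; since the components are orthogonal in the Frobenius inner product, these values add up to one (up to rounding).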

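Algorithm 3.2: SSA with projection: reconstruction

Input: Decomposition $\mathbf{X} = \mathbf{X}_1 + \ldots + \mathbf{X}_d$ and a grouping $\{1,\ldots,d\} = \bigcup_{k=1}^{m} I_k$ that does not split the first $p + q$
projection components, where $q$ and $p$ are the numbers of row and column projection components.

Output: Decomposition of the time series into identifiable components $\mathsf{X} = \widetilde{\mathsf{X}}_1 + \ldots + \widetilde{\mathsf{X}}_m$.

1: Construct the grouped matrix decomposition $\mathbf{X} = \mathbf{X}_{I_1} + \ldots + \mathbf{X}_{I_m}$, where $\mathbf{X}_I = \sum_{i \in I} \mathbf{X}_i$.

2: Compute $\mathsf{X} = \widetilde{\mathsf{X}}_1 + \ldots + \widetilde{\mathsf{X}}_m$, where $\widetilde{\mathsf{X}}_k = \mathcal{T}_{\mathrm{SSA}}^{-1} \circ \Pi^{(H)}(\mathbf{X}_{I_k})$.


The only essential difference from the reconstruction by Basic SSA is that the set of the matrices $\mathbf{X}_i$, $i =
1,\ldots,p + q$, produced by projections, should be included in the same group.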

Note that formally, the sets $\{P_i, i = 1,\ldots,p\}$ and $\{Q_i, i = 1,\ldots,q\}$ can be arbitrary. However, if the model
of the series is partly known, then in the context of SSA this means that a time series component satisfies an
LRR and we know its characteristic roots (see Section 2.3). Therefore, to extract, for example, a sine wave
using projections, we should know its period, and to extract an exponential trend, we should know its rate.
These conditions are often too restrictive. A clear exception is the extraction of polynomial trends of degree $m$,
when there is a unique characteristic root equal to 1 of multiplicity $m + 1$ and we need to assume only the
degree of the polynomial trend to obtain its trajectory space.
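For polynomial trends, an orthonormal projection basis can be obtained by orthonormalizing the monomials. The base-R sketch below is one possible way to do this and is not taken from the paper or from Rssa; the function name `poly_basis` is hypothetical, and QR decomposition is used here as the orthonormalization step.

```r
# Orthonormal basis of the space of polynomials of degree <= m sampled at 1..M
# (hypothetical helper; QR is one of several ways to orthonormalize).
poly_basis <- function(M, m) {
  V <- outer(seq_len(M), 0:m, `^`)   # columns 1, n, n^2, ..., n^m
  qr.Q(qr(V))                        # M x (m + 1) matrix, orthonormal columns
}

B <- poly_basis(12, 1)               # basis for linear functions, window 12
v <- 3 * (1:12) + 2                  # a linear vector
v_proj <- as.numeric(B %*% crossprod(B, v))   # its projection onto span(B)
```

A linear vector is left unchanged by the projection, so `v_proj` reproduces `v` up to rounding. For high degrees the monomial matrix becomes ill-conditioned, so this sketch is intended for small `m` only.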


4 Examples 

The presented examples are related to finding polynomial trends. For convenience, if the row and column
projections are performed onto the subspaces generated by polynomials of degree $q - 1$ and $p - 1$ respectively, then
we denote the method as ProjSSA(q,p). Recall (see Corollary 2) that the choice ProjSSA(q,p) corresponds to
extraction of a polynomial trend of degree $q + p - 1$. In ProjSSA(q,p), the projection part of the decomposition,
i.e., the matrix $\mathbf{C}$ in (6), consists of $p + q$ rank-one matrices. ProjSSA(1,1) is used for extraction of a linear
trend. The zero value for $p$ or $q$ means that the corresponding projection is not performed.

All the examples are implemented in R with the help of the Rssa package. For example, to
perform ProjSSA(q,p) for a time series taken from the variable x with a window length L, the following code
should be called:



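    library(Rssa)  # provides ssa(), reconstruct() and nspecial()
    # Decompose x with q row and p column projection components
    s <- ssa(x, L = L,
             row.projector = q,
             column.projector = p)
    # Put all projection ("special") components into the trend group
    r <- reconstruct(s,
                     groups = list(trend = 1:nspecial(s)))
    plot(r, add.residuals = FALSE,
         plot.method = "xyplot", superpose = TRUE)

For more details on Rssa, see the package help files.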

4.1 SSA with projection and regression 

Let us demonstrate that the conventional linear regression and SSA with double centering, i.e., ProjSSA(l,l), 
use different statements of the solved problem and therefore can yield different results. It is clearly seen in 
short time series. For long time series the results are very close. Also, in the model of linear regression with 
Gaussian noise, the regression solution is optimal. Therefore, to demonstrate the difference, we consider a time 
series, which contains a seasonal component. 

Here we examine the time series ‘Gasoline’, which contains the data GASOLINE DEMAND,
MONTHLY, Jan 1960 - Jun 1967, ONTARIO, GALLON MILLIONS.


Figure 1: ‘Gasoline’: SSA with projection, linear trend detection. (Legend: Regression, Series, ProjSSA, Trend.)

Let us consider the first two years and apply the linear regression and ProjSSA(1,1) with $L = 12$. To show
the difference, we continue the linear regression line with the help of the estimated coefficients. In Rssa,
a method of forecasting for SSA with projection is implemented. Since it has not been proved yet, we will construct
the forecast by a linear regression applied to the reconstruction, which is performed by ProjSSA(1,1). Note
that the forecasting procedure from Rssa provides a similar prediction. As a benchmark, the linear regression
constructed from the whole series is considered.

One can see in Figure 1 that the ProjSSA(1,1) linear trend (blue) is very close to the linear trend constructed
from the whole long time series (green). The linear regression line (red) gives a much worse approximation of the
trend. This is explained as follows. The least-squares approach to the linear regression estimation
minimizes the prediction error and therefore the seasonal component can shift the linear regression trend. For
ProjSSA(1,1), the seasonal component is well separated from the linear trend, since the chosen parameters
$L = K = 12$ are divisible by the seasonal period 12.
4.2 SSA with projection and Basic SSA


The example introduced in this section demonstrates that both SSA with projection and Basic SSA can extract
trends in a similar manner. Let us consider the example ‘co2’ (Mauna Loa Atmospheric CO2 Concentration,
468 observations, monthly from 1959 to 1997).


Figure 2: co2: Reconstructions of the trend. Left-top: ProjSSA(1,1), L = 228; right-top: ProjSSA(1,1), L =
228, complemented by the ET 5 and 8; left-bottom: ProjSSA(1,1), L = 36; right-bottom: ProjSSA(2,2), L =
228.

We start with extraction of the linear trend and therefore choose ProjSSA(l,l) to perform SSA with double 
centering. 

By analogy with SSA, large window lengths help to extract separable series components, while small 
window lengths correspond to smoothing. Therefore, we take L = 228, which is divisible by 12 and is close to 
half of the time series length to obtain better separability, and a small value $L = 36$ to smooth the series. Three
of the four versions of the extracted trends presented in Figure 2 almost coincide.

For the choice $L = 228$, the extracted trend is close to linear, see Figure 2 (left-top). Certainly, the accurate
trend of the ‘co2’ series is not linear. However, the projection components can be supplemented by the 1st and 4th
SVD components (ET5,8) to improve the trend (Figure 2 (right-top)). Figure 2 (left-bottom) shows the result
of smoothing with $L = 36$. Finally, the result of ProjSSA(2,2) with $L = 228$, which is designed for extraction
of a cubic trend, is depicted in Figure 2 (right-bottom). The extracted trend is very similar to the trend
extracted by Basic SSA in earlier work (not depicted).

Identification of the components in the decomposition produced by SSA with projection is exactly the same 
as it is performed in Basic SSA. 

4.3 Numerical comparison 

The real-life examples presented in Sections 4.1 and 4.2 show that the results of Basic SSA, SSA with projection
and linear regression can be either different or similar. To understand which method is better, let us perform a
numerical study.
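We consider a time series of length N = 199 with the common term

$$x_n = t_n + s_n + \varepsilon_n, \tag{8}$$

where $t_n$ is a trend, $s_n = A \sin(2\pi n \omega + \phi)$, and $\varepsilon_n$ is Gaussian white noise with standard deviation $\sigma$.

For the obtained estimates $\hat{t}_n^{(i)}$, where $i$ is the index of the series with the $i$th realization of noise $\varepsilon_n^{(i)}$, $i = 1, \ldots, M$,
we calculate the root-mean-square error (RMSE) as

$$\mathrm{RMSE} = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \frac{1}{N} \sum_{n=1}^{N} \bigl(\hat{t}_n^{(i)} - t_n\bigr)^2}.$$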
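The simulation model and the error measure can be sketched in Python as follows. The sketch assumes a linear trend, indexing n = 1, ..., N, and an RMSE that averages squared errors over all N points and all M realizations; the function names and the argument order are illustrative choices, not part of the paper.

```python
import numpy as np

def simulate_series(N, a, b, A, omega, phi, sigma, rng):
    """Return (n, t, x): index, linear trend t_n = a*n + b, observed series."""
    n = np.arange(1, N + 1)
    t = a * n + b
    s = A * np.sin(2 * np.pi * n * omega + phi)   # periodic component
    eps = sigma * rng.standard_normal(N)          # Gaussian white noise
    return n, t, t + s + eps

def rmse(estimates, t):
    """RMSE over M realizations; estimates has shape (M, N), t has shape (N,)."""
    estimates = np.atleast_2d(np.asarray(estimates, dtype=float))
    return float(np.sqrt(np.mean((estimates - t) ** 2)))

# Noiseless case (sigma = 0, M = 1): error of the plain linear regression
# when the residual is a sine wave with frequency 0.02 and zero phase.
rng = np.random.default_rng(0)
n, t, x = simulate_series(199, 1.0, -100.0, 1.0, 0.02, 0.0, 0.0, rng)
regression_line = np.polyval(np.polyfit(n, x, 1), n)
print(rmse(regression_line, t))
```

Any trend estimator (Basic SSA, SSA with projection, regression) can be plugged in place of the regression line to reproduce the kind of comparison reported in this section.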

Linear trend and sine wave. Let us start with the noiseless case $\sigma = 0$ and therefore take M = 1. Let
$t_n = an + b$. We fix a = 1, b = −100, A = 1 and vary $\omega$ from 0.02 to 0.1 (that is, the period changes from
50 to 10).

Since the result of the least-squares method strongly depends on the form of the residual, we consider two
values of the phase, $\phi = 0$ and $\phi = \pi/2$.

Figure 3 (left) contains the RMSE values in the case $\phi = 0$ for Basic SSA with reconstruction by ET1-2,
ProjSSA(2,0), ProjSSA(1,1) with L = 100, and for the linear regression. One can see that the worst cases for
ProjSSA(1,1) are approximately equal to the best cases for the linear regression.


Figure 3: Dependence of the RMSE of linear-trend estimates on the frequency of the periodic component, $\phi = 0$.
Legend: linear regression, Basic SSA, ProjSSA(2,0), ProjSSA(1,1); in the right panel the SSA estimates are marked ‘regr’.

Figure 4: Dependence of the RMSE of linear-trend estimates on the frequency of the periodic component, $\phi = \pi/2$.


In Section 4.1, we performed forecasting by linear regression applied to the trend reconstruction. Figure 3
(right) contains the RMSE for the linear regression lines constructed in this way; ‘regr’ is added to the legend.
The ordering of the SSA methods is generally the same, while the SSA methods become better than the linear
regression. Probably, $\phi = 0$ is one of the worst values of $\phi$ for linear regression.

Now consider $\phi = \pi/2$ as one of the best cases for the linear regression. The behavior of the errors is
quite different (Figure 4 (left)). However, the accuracy of ProjSSA(1,1) is still better than that of the linear
regression. Linear least-squares approximation of the SSA reconstructions considerably improves the accuracy
of the SSA methods (Figure 4 (right)).
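The ‘regr’ post-processing step, that is, a polynomial least-squares fit applied to an already reconstructed trend, might look as follows. This is a minimal sketch; the function name is hypothetical, and the input is assumed to be any trend estimate indexed by n = 1, ..., N.

```python
import numpy as np

def polynomial_refit(trend_estimate, degree=1):
    """Least-squares polynomial approximation of a reconstructed trend."""
    trend_estimate = np.asarray(trend_estimate, dtype=float)
    n = np.arange(1, trend_estimate.size + 1)
    coeffs = np.polyfit(n, trend_estimate, degree)   # highest power first
    return np.polyval(coeffs, n)

# Example: a wavy reconstruction of a linear trend is replaced by a straight line.
n = np.arange(1, 51)
wavy_reconstruction = 3.0 * n - 7.0 + 0.2 * np.sin(n)
line = polynomial_refit(wavy_reconstruction, degree=1)
```

With degree=3 the same function gives the cubic least-squares approximation used for cubic trends.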
Note that the zero values of the RMSE for ProjSSA(1,1) at frequencies $\omega = 0.01k$ with integer $k$ are explained by the
theory, since then $L\omega$ and $K\omega$ are integers. The errors for ProjSSA(2,0) lie between those for Basic SSA and
ProjSSA(1,1). It is interesting that the minimal errors for Basic SSA are achieved at the middle points, when
$L\omega + 0.5$ and $K\omega + 0.5$ are integers.
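One way to see why integer numbers of periods matter for double centering: the mean of a sinusoid over a window containing a whole number of periods is zero, so centering over such a window leaves the sinusoid untouched. The short check below illustrates this for a window of length 100 (the value of L used here, which also equals K if N = 199); the function name is illustrative.

```python
import numpy as np

def window_mean_of_sine(length, omega, phi=0.0):
    """Mean of sin(2*pi*n*omega + phi) over n = 0, ..., length - 1."""
    n = np.arange(length)
    return float(np.mean(np.sin(2 * np.pi * n * omega + phi)))

# Frequencies on the 0.01 grid give a whole number of periods in the window;
# frequencies halfway between grid points do not.
for omega in (0.05, 0.055, 0.06):
    print(omega, window_mean_of_sine(100, omega))
```

For the frequencies on the 0.01 grid the printed means vanish up to rounding error, while for the intermediate frequency the mean is clearly nonzero.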

Cubic trend and sine wave. Let us consider a more complex case of the cubic trend $t_n = 0.0001 n^3$. Since
there is no exact separability for any choice of parameters, the results cannot be predicted from the theory. Figures 5 (left) and 6
(left) contain the RMSE values for Basic SSA with reconstruction by ET1-4, ProjSSA(4,0), ProjSSA(2,2) with
L = 100 and for the cubic regression. One can see that ProjSSA(2,2) is the best method for $\phi = 0$, while it is
just comparable with the linear regression for $\phi = \pi/2$. Note that here the best parameters for ProjSSA(2,2)
do not correspond to the case when $L\omega$ and $K\omega$ are integers. The cubic least-squares approximation of the
reconstructed trend again improves the estimates (Figures 5 (right) and 6 (right)).


Figure 5: Dependence of the RMSE of cubic-trend estimates on the frequency of the periodic component, $\phi = 0$.
Legend: linear regression, Basic SSA, ProjSSA(4,0), ProjSSA(2,2); in the right panel the SSA estimates are marked ‘regr’.

Figure 6: Dependence of the RMSE of cubic-trend estimates on the frequency of the periodic component, $\phi = \pi/2$.

Basic SSA fails for the chosen parameters because of the lack of strong separability: the fourth trend component
has a contribution comparable with that of the periodic components, which causes them to mix.

Note that one of the modifications described in [?], Iterative O-SSA, can be used to get strong exact separability
for the considered noiseless examples. However, we do not include this modification in the comparison,
since Iterative O-SSA is not able to remove noise and should be applied after denoising in a nested manner, while
the compared methods are able to extract the trend without denoising.

Linear trend and noise. For data that satisfy the model of linear regression with white Gaussian
noise, that is, for the amplitude A equal to zero, we take $\sigma = 1$ and use M = 1000. As expected, the smallest
error, 0.10, is achieved for the regression estimate. However, the RMSE of the ProjSSA(1,1) estimate, equal to
0.12, is very close to 0.10. The error of Basic SSA is equal to 0.17. Application of linear regression to
the results of SSA reconstruction improves the SSA estimates: the RMSE for ProjSSA(1,1) and Basic SSA
becomes equal to 0.115 and 0.104, respectively.
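The value 0.10 for the regression estimate agrees with a simple back-of-the-envelope calculation. Assuming that the RMSE averages squared errors over all N time points and all M realizations, and taking N = 199, recall that for ordinary least squares with p fitted coefficients and independent noise of variance $\sigma^2$ the expected mean squared error of the fitted values equals $\sigma^2 p / N$ (the trace of the hat matrix is p). For a straight line, p = 2, which gives approximately:

```latex
\mathrm{RMSE}_{\mathrm{regr}} \approx \sigma \sqrt{\frac{p}{N}}
  = 1 \cdot \sqrt{\frac{2}{199}} \approx 0.100 .
```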
We do not show the results when the series has both a periodic component and noise, since the errors are
intermediate. To keep the advantage of SSA with projection, the noise standard deviation $\sigma$ should be considerably
smaller than the amplitude A of the periodic component.

5 Conclusion 

The considered combination of singular spectrum analysis, which does not need a series model given in advance,
and of a subspace-based parametric approach, which is incorporated by means of projections to subspaces
given in advance, proves successful for extraction of polynomial (especially linear) trends, when the
residual has unknown structure and can include deterministic oscillations, e.g., seasonality.

The general form of projections of columns and rows of the trajectory matrix, which keeps this trajectory
matrix, was obtained. It was proved that projections to the row and column subspaces (so-called double
projection) of the trajectory matrix of a series Y are related to extraction of the series (an + b)Y. In particular, the
linear trend can be obtained by double projection to the column and row subspaces of a constant series. The
formulated conditions of separability of a series component, which is kept by projections, show that if a series
component can be represented in the form (an + b)Y, then the double projection is preferable.

Thus, the theory provides additional theoretical support to SSA with double centering (ProjSSA(1,1)),
which was known before, and also enlarges the range of applications of semi-nonparametric modifications of
Basic SSA.

Applications of SSA with projection considered in the paper were related to the extraction of a polynomial
trend, since its trajectory space is determined by the polynomial degree only.

We showed on the example ‘Gasoline’ that the linear regression approach can be inadequate for short series
and large oscillations, in comparison with ProjSSA(1,1). Comparison of different SSA versions applied to
the ‘co2’ data demonstrates that even if the model of a series component used for projection is wrong, the
non-parametric part of SSA with projection can correct the bias.

A numerical study was performed for a better understanding of the difference between SSA with projection
and the linear regression approach. First, it appears that if we extract a polynomial trend by SSA with projection,
then the polynomial least-squares approximation of the trend reconstruction can considerably improve the
accuracy.

The second effect found is related to the influence of the residual geometry on the estimate accuracy. In
the considered example, we changed the phase of a sinusoid. The SSA estimates depend only slightly on the phase,
while the regression estimates demonstrate a considerable dependence.

Numerical experiments confirm that for a linear trend and a sine-wave residual, ProjSSA(1,1) is more
accurate than the linear regression estimate. For a noisy linear trend, when the model of the linear regression is
fulfilled, the linear regression estimate is slightly more accurate than SSA. Thus, we can formulate conditions
under which SSA with double projection can be recommended for use: the series has a linear or polynomial trend (the
polynomial degree is not large) and the regular oscillations are considerably larger than the noise level.

Further investigation can be performed in two directions. First, the forecasting algorithm for ProjSSA(m,k)
implemented in RSSA should be justified theoretically. Second, the idea of using projection to involve the structure of a supporting
series looks promising.